Overview

Dataset statistics

Number of variables: 21
Number of observations: 32950
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 9
Duplicate rows (%): < 0.1%
Total size in memory: 24.2 MiB
Average record size in memory: 770.4 B
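The derived figures in this table can be cross-checked against the raw counts. The following is a minimal sketch in plain Python; the variable names are illustrative, and because the 24.2 MiB total is itself rounded, the record size is only expected to match approximately.

```python
# Values copied from the dataset statistics above.
n_rows = 32950
duplicate_rows = 9
total_memory_mib = 24.2  # already rounded in the report

# Share of duplicate rows, in percent (the report shows "< 0.1%").
duplicate_pct = duplicate_rows / n_rows * 100

# Average bytes per record, assuming 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_memory_mib * 1024**2 / n_rows

print(f"duplicate rows: {duplicate_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The small gap between the computed record size and the reported 770.4 B is consistent with rounding of the memory total.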

Variable types

Numeric: 10
Categorical: 10
Boolean: 1

Warnings

Dataset has 9 (< 0.1%) duplicate rows [Duplicates]
emp.var.rate is highly correlated with euribor3m and 1 other field [High correlation]
euribor3m is highly correlated with emp.var.rate and 1 other field [High correlation]
nr.employed is highly correlated with emp.var.rate and 1 other field [High correlation]
previous has 28394 (86.2%) zeros [Zeros]

Reproduction

Analysis started: 2021-05-18 08:44:05.201698
Analysis finished: 2021-05-18 08:44:34.911085
Duration: 29.71 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
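The reported duration follows directly from the two timestamps. A small sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2021-05-18 08:44:05.201698")
finished = datetime.fromisoformat("2021-05-18 08:44:34.911085")

# Elapsed wall-clock time of the analysis, in seconds.
duration_seconds = (finished - started).total_seconds()
print(round(duration_seconds, 2))
```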

Variables

age
Real number (ℝ≥0)

Distinct: 77
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.04021244
Minimum: 17
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Memory size: 257.5 KiB

Quantile statistics

Minimum: 17
5-th percentile: 26
Q1: 32
Median: 38
Q3: 47
95-th percentile: 58
Maximum: 98
Range: 81
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 10.43231316
Coefficient of variation (CV): 0.2605458993
Kurtosis: 0.7718456354
Mean: 40.04021244
Median Absolute Deviation (MAD): 7
Skewness: 0.777049751
Sum: 1319325
Variance: 108.8331579
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
31                  1547    4.7%
32                  1463    4.4%
33                  1430    4.3%
35                  1426    4.3%
36                  1409    4.3%
34                  1392    4.2%
30                  1381    4.2%
29                  1173    3.6%
37                  1173    3.6%
39                  1159    3.5%
Other values (67)   19397   58.9%

Value               Count   Frequency (%)
17                  4       < 0.1%
18                  23      0.1%
19                  33      0.1%
20                  53      0.2%
21                  89      0.3%
22                  109     0.3%
23                  185     0.6%
24                  370     1.1%
25                  477     1.4%
26                  548     1.7%

Value               Count   Frequency (%)
98                  2       < 0.1%
95                  1       < 0.1%
94                  1       < 0.1%
92                  3       < 0.1%
91                  2       < 0.1%
89                  2       < 0.1%
88                  16      < 0.1%
86                  7       < 0.1%
85                  13      < 0.1%
84                  4       < 0.1%
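Several of the age statistics are derived from others and can be recomputed as a sanity check. The sketch below assumes the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative and the inputs are copied from the report.

```python
# Inputs copied from the age statistics above.
n = 32950
total = 1319325
std = 10.43231316
minimum, q1, q3, maximum = 17, 32, 47, 98

mean = total / n                 # reported mean: 40.04021244
value_range = maximum - minimum  # reported range: 81
iqr = q3 - q1                    # reported IQR: 15
cv = std / mean                  # reported CV: 0.2605458993
variance = std ** 2              # reported variance: 108.8331579

print(mean, value_range, iqr, cv, variance)
```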

job
Categorical

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
admin.: 8339
blue-collar: 7356
technician: 5426
services: 3161
management: 2343
Other values (7): 6325

Length

Max length: 13
Median length: 10
Mean length: 8.955842185
Min length: 6

Characters and Unicode

Total characters: 295095
Distinct characters: 24
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
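The category names used in this report ("Lowercase Letter", "Dash Punctuation", "Other Punctuation") correspond to Unicode general categories, which Python's standard `unicodedata` module exposes as the two-letter codes `Ll`, `Pd` and `Po`. A minimal sketch, using the job labels and counts listed in this section; the two labels the report folds into "Other values (2)" are left out, on the assumption that they contain only letters.

```python
import unicodedata
from collections import Counter

# Job labels and row counts copied from the report (ten named categories only).
job_counts = {
    "admin.": 8339, "blue-collar": 7356, "technician": 5426,
    "services": 3161, "management": 2343, "retired": 1388,
    "entrepreneur": 1193, "self-employed": 1140,
    "housemaid": 854, "unemployed": 793,
}

# Weight every character's general category by how many rows carry the label.
category_totals = Counter()
for label, count in job_counts.items():
    for ch in label:
        category_totals[unicodedata.category(ch)] += count

print(category_totals["Pd"])  # dash punctuation: the "-" characters
print(category_totals["Po"])  # other punctuation: the "." characters
```

Under that assumption, the two punctuation totals can be compared directly with the "Most occurring categories" table for this variable; the lowercase-letter total will be lower than reported because two labels are missing.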

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: technician
2nd row: unknown
3rd row: blue-collar
4th row: admin.
5th row: housemaid
ValueCountFrequency (%)
admin.8339
25.3%
blue-collar7356
22.3%
technician5426
16.5%
services3161
 
9.6%
management2343
 
7.1%
retired1388
 
4.2%
entrepreneur1193
 
3.6%
self-employed1140
 
3.5%
housemaid854
 
2.6%
unemployed793
 
2.4%
Other values (2)957
 
2.9%
Histogram of lengths of the category
ValueCountFrequency (%)
admin8339
25.3%
blue-collar7356
22.3%
technician5426
16.5%
services3161
 
9.6%
management2343
 
7.1%
retired1388
 
4.2%
entrepreneur1193
 
3.6%
self-employed1140
 
3.5%
housemaid854
 
2.6%
unemployed793
 
2.4%
Other values (2)957
 
2.9%

Most occurring characters

ValueCountFrequency (%)
e37880
12.8%
n28563
 
9.7%
a26661
 
9.0%
l25141
 
8.5%
i24594
 
8.3%
c21369
 
7.2%
r16872
 
5.7%
m15812
 
5.4%
d13196
 
4.5%
t11714
 
4.0%
Other values (14)73293
24.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter278260
94.3%
Dash Punctuation8496
 
2.9%
Other Punctuation8339
 
2.8%

Most frequent character per category

ValueCountFrequency (%)
e37880
13.6%
n28563
10.3%
a26661
9.6%
l25141
9.0%
i24594
8.8%
c21369
 
7.7%
r16872
 
6.1%
m15812
 
5.7%
d13196
 
4.7%
t11714
 
4.2%
Other values (12)56458
20.3%
ValueCountFrequency (%)
-8496
100.0%
ValueCountFrequency (%)
.8339
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin278260
94.3%
Common16835
 
5.7%

Most frequent character per script

ValueCountFrequency (%)
e37880
13.6%
n28563
10.3%
a26661
9.6%
l25141
9.0%
i24594
8.8%
c21369
 
7.7%
r16872
 
6.1%
m15812
 
5.7%
d13196
 
4.7%
t11714
 
4.2%
Other values (12)56458
20.3%
ValueCountFrequency (%)
-8496
50.5%
.8339
49.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII295095
100.0%

Most frequent character per block

ValueCountFrequency (%)
e37880
12.8%
n28563
 
9.7%
a26661
 
9.0%
l25141
 
8.5%
i24594
 
8.3%
c21369
 
7.2%
r16872
 
5.7%
m15812
 
5.4%
d13196
 
4.5%
t11714
 
4.0%
Other values (14)73293
24.8%

marital
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.0 MiB
married
19966 
single
9242 
divorced
3676 
unknown
 
66

Length

Max length8
Median length7
Mean length6.83107739
Min length6

Characters and Unicode

Total characters225084
Distinct characters16
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmarried
2nd rowmarried
3rd rowmarried
4th rowmarried
5th rowmarried
ValueCountFrequency (%)
married19966
60.6%
single9242
28.0%
divorced3676
 
11.2%
unknown66
 
0.2%
Histogram of lengths of the category
ValueCountFrequency (%)
married19966
60.6%
single9242
28.0%
divorced3676
 
11.2%
unknown66
 
0.2%

Most occurring characters

ValueCountFrequency (%)
r43608
19.4%
i32884
14.6%
e32884
14.6%
d27318
12.1%
m19966
8.9%
a19966
8.9%
n9440
 
4.2%
s9242
 
4.1%
g9242
 
4.1%
l9242
 
4.1%
Other values (6)11292
 
5.0%
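The character table above can be rebuilt from the four category counts alone, since every row contributes the characters of its label. A minimal sketch with the standard library; the dictionary is copied from the marital frequency table and the variable names are illustrative.

```python
from collections import Counter

# Category counts copied from the marital frequency table.
marital_counts = {"married": 19966, "single": 9242, "divorced": 3676, "unknown": 66}

# Each occurrence of a label adds one count per character in that label.
char_totals = Counter()
for label, count in marital_counts.items():
    for ch in label:
        char_totals[ch] += count

total_characters = sum(char_totals.values())
mean_length = total_characters / sum(marital_counts.values())

print(char_totals.most_common(4))
print(total_characters, len(char_totals), mean_length)
```

The totals can be compared with the "Total characters", "Distinct characters" and "Mean length" entries reported for this variable.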

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter225084
100.0%

Most frequent character per category

ValueCountFrequency (%)
r43608
19.4%
i32884
14.6%
e32884
14.6%
d27318
12.1%
m19966
8.9%
a19966
8.9%
n9440
 
4.2%
s9242
 
4.1%
g9242
 
4.1%
l9242
 
4.1%
Other values (6)11292
 
5.0%

Most occurring scripts

ValueCountFrequency (%)
Latin225084
100.0%

Most frequent character per script

ValueCountFrequency (%)
r43608
19.4%
i32884
14.6%
e32884
14.6%
d27318
12.1%
m19966
8.9%
a19966
8.9%
n9440
 
4.2%
s9242
 
4.1%
g9242
 
4.1%
l9242
 
4.1%
Other values (6)11292
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII225084
100.0%

Most frequent character per block

ValueCountFrequency (%)
r43608
19.4%
i32884
14.6%
e32884
14.6%
d27318
12.1%
m19966
8.9%
a19966
8.9%
n9440
 
4.2%
s9242
 
4.1%
g9242
 
4.1%
l9242
 
4.1%
Other values (6)11292
 
5.0%

education
Categorical

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.2 MiB
university.degree
9717 
high.school
7553 
basic.9y
4860 
professional.course
4229 
basic.4y
3333 
Other values (3)
3258 

Length

Max length19
Median length11
Mean length12.71213961
Min length7

Characters and Unicode

Total characters418865
Distinct characters25
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowhigh.school
2nd rowunknown
3rd rowbasic.9y
4th rowhigh.school
5th rowhigh.school
ValueCountFrequency (%)
university.degree9717
29.5%
high.school7553
22.9%
basic.9y4860
14.7%
professional.course4229
12.8%
basic.4y3333
 
10.1%
basic.6y1847
 
5.6%
unknown1396
 
4.2%
illiterate15
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
university.degree9717
29.5%
high.school7553
22.9%
basic.9y4860
14.7%
professional.course4229
12.8%
basic.4y3333
 
10.1%
basic.6y1847
 
5.6%
unknown1396
 
4.2%
illiterate15
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e47356
 
11.3%
i41286
 
9.9%
s39997
 
9.5%
.31539
 
7.5%
o29189
 
7.0%
r27907
 
6.7%
h22659
 
5.4%
c21822
 
5.2%
y19757
 
4.7%
n18134
 
4.3%
Other values (15)119219
28.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter377286
90.1%
Other Punctuation31539
 
7.5%
Decimal Number10040
 
2.4%

Most frequent character per category

ValueCountFrequency (%)
e47356
12.6%
i41286
10.9%
s39997
10.6%
o29189
 
7.7%
r27907
 
7.4%
h22659
 
6.0%
c21822
 
5.8%
y19757
 
5.2%
n18134
 
4.8%
g17270
 
4.6%
Other values (11)91909
24.4%
ValueCountFrequency (%)
94860
48.4%
43333
33.2%
61847
 
18.4%
ValueCountFrequency (%)
.31539
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin377286
90.1%
Common41579
 
9.9%

Most frequent character per script

ValueCountFrequency (%)
e47356
12.6%
i41286
10.9%
s39997
10.6%
o29189
 
7.7%
r27907
 
7.4%
h22659
 
6.0%
c21822
 
5.8%
y19757
 
5.2%
n18134
 
4.8%
g17270
 
4.6%
Other values (11)91909
24.4%
ValueCountFrequency (%)
.31539
75.9%
94860
 
11.7%
43333
 
8.0%
61847
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII418865
100.0%

Most frequent character per block

ValueCountFrequency (%)
e47356
 
11.3%
i41286
 
9.9%
s39997
 
9.5%
.31539
 
7.5%
o29189
 
7.0%
r27907
 
6.7%
h22659
 
5.4%
c21822
 
5.2%
y19757
 
4.7%
n18134
 
4.3%
Other values (15)119219
28.5%

default
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.9 MiB
no
26064 
unknown
6883 
yes
 
3

Length

Max length7
Median length2
Mean length3.044552352
Min length2

Characters and Unicode

Total characters100318
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowno
2nd rowunknown
3rd rowno
4th rowno
5th rowno
ValueCountFrequency (%)
no26064
79.1%
unknown6883
 
20.9%
yes3
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no26064
79.1%
unknown6883
 
20.9%
yes3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n46713
46.6%
o32947
32.8%
u6883
 
6.9%
k6883
 
6.9%
w6883
 
6.9%
y3
 
< 0.1%
e3
 
< 0.1%
s3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter100318
100.0%

Most frequent character per category

ValueCountFrequency (%)
n46713
46.6%
o32947
32.8%
u6883
 
6.9%
k6883
 
6.9%
w6883
 
6.9%
y3
 
< 0.1%
e3
 
< 0.1%
s3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin100318
100.0%

Most frequent character per script

ValueCountFrequency (%)
n46713
46.6%
o32947
32.8%
u6883
 
6.9%
k6883
 
6.9%
w6883
 
6.9%
y3
 
< 0.1%
e3
 
< 0.1%
s3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII100318
100.0%

Most frequent character per block

ValueCountFrequency (%)
n46713
46.6%
o32947
32.8%
u6883
 
6.9%
k6883
 
6.9%
w6883
 
6.9%
y3
 
< 0.1%
e3
 
< 0.1%
s3
 
< 0.1%

housing
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.9 MiB
yes
17232 
no
14936 
unknown
 
782

Length

Max length7
Median length3
Mean length2.641638847
Min length2

Characters and Unicode

Total characters87042
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowno
2nd rowyes
3rd rowno
4th rowno
5th rowyes
ValueCountFrequency (%)
yes17232
52.3%
no14936
45.3%
unknown782
 
2.4%
Histogram of lengths of the category
ValueCountFrequency (%)
yes17232
52.3%
no14936
45.3%
unknown782
 
2.4%

Most occurring characters

ValueCountFrequency (%)
n17282
19.9%
y17232
19.8%
e17232
19.8%
s17232
19.8%
o15718
18.1%
u782
 
0.9%
k782
 
0.9%
w782
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter87042
100.0%

Most frequent character per category

ValueCountFrequency (%)
n17282
19.9%
y17232
19.8%
e17232
19.8%
s17232
19.8%
o15718
18.1%
u782
 
0.9%
k782
 
0.9%
w782
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin87042
100.0%

Most frequent character per script

ValueCountFrequency (%)
n17282
19.9%
y17232
19.8%
e17232
19.8%
s17232
19.8%
o15718
18.1%
u782
 
0.9%
k782
 
0.9%
w782
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII87042
100.0%

Most frequent character per block

ValueCountFrequency (%)
n17282
19.9%
y17232
19.8%
e17232
19.8%
s17232
19.8%
o15718
18.1%
u782
 
0.9%
k782
 
0.9%
w782
 
0.9%

loan
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.9 MiB
no
27166 
yes
5002 
unknown
 
782

Length

Max length7
Median length2
Mean length2.27047041
Min length2

Characters and Unicode

Total characters74812
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowyes
2nd rowno
3rd rowno
4th rowno
5th rowno
ValueCountFrequency (%)
no27166
82.4%
yes5002
 
15.2%
unknown782
 
2.4%
Histogram of lengths of the category
ValueCountFrequency (%)
no27166
82.4%
yes5002
 
15.2%
unknown782
 
2.4%

Most occurring characters

ValueCountFrequency (%)
n29512
39.4%
o27948
37.4%
y5002
 
6.7%
e5002
 
6.7%
s5002
 
6.7%
u782
 
1.0%
k782
 
1.0%
w782
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter74812
100.0%

Most frequent character per category

ValueCountFrequency (%)
n29512
39.4%
o27948
37.4%
y5002
 
6.7%
e5002
 
6.7%
s5002
 
6.7%
u782
 
1.0%
k782
 
1.0%
w782
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Latin74812
100.0%

Most frequent character per script

ValueCountFrequency (%)
n29512
39.4%
o27948
37.4%
y5002
 
6.7%
e5002
 
6.7%
s5002
 
6.7%
u782
 
1.0%
k782
 
1.0%
w782
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII74812
100.0%

Most frequent character per block

ValueCountFrequency (%)
n29512
39.4%
o27948
37.4%
y5002
 
6.7%
e5002
 
6.7%
s5002
 
6.7%
u782
 
1.0%
k782
 
1.0%
w782
 
1.0%

contact
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.1 MiB
cellular
20946 
telephone
12004 

Length

Max length9
Median length8
Mean length8.36430956
Min length8

Characters and Unicode

Total characters275604
Distinct characters11
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowcellular
2nd rowtelephone
3rd rowcellular
4th rowtelephone
5th rowcellular
ValueCountFrequency (%)
cellular20946
63.6%
telephone12004
36.4%
Histogram of lengths of the category
ValueCountFrequency (%)
cellular20946
63.6%
telephone12004
36.4%

Most occurring characters

ValueCountFrequency (%)
l74842
27.2%
e56958
20.7%
c20946
 
7.6%
u20946
 
7.6%
a20946
 
7.6%
r20946
 
7.6%
t12004
 
4.4%
p12004
 
4.4%
h12004
 
4.4%
o12004
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter275604
100.0%

Most frequent character per category

ValueCountFrequency (%)
l74842
27.2%
e56958
20.7%
c20946
 
7.6%
u20946
 
7.6%
a20946
 
7.6%
r20946
 
7.6%
t12004
 
4.4%
p12004
 
4.4%
h12004
 
4.4%
o12004
 
4.4%

Most occurring scripts

ValueCountFrequency (%)
Latin275604
100.0%

Most frequent character per script

ValueCountFrequency (%)
l74842
27.2%
e56958
20.7%
c20946
 
7.6%
u20946
 
7.6%
a20946
 
7.6%
r20946
 
7.6%
t12004
 
4.4%
p12004
 
4.4%
h12004
 
4.4%
o12004
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII275604
100.0%

Most frequent character per block

ValueCountFrequency (%)
l74842
27.2%
e56958
20.7%
c20946
 
7.6%
u20946
 
7.6%
a20946
 
7.6%
r20946
 
7.6%
t12004
 
4.4%
p12004
 
4.4%
h12004
 
4.4%
o12004
 
4.4%

month
Categorical

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.9 MiB
may
11073 
jul
5753 
aug
4882 
jun
4251 
nov
3296 
Other values (5)
3695 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters98850
Distinct characters17
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmay
2nd rowmay
3rd rowmay
4th rowjun
5th rowjul
ValueCountFrequency (%)
may11073
33.6%
jul5753
17.5%
aug4882
14.8%
jun4251
 
12.9%
nov3296
 
10.0%
apr2108
 
6.4%
oct569
 
1.7%
sep453
 
1.4%
mar421
 
1.3%
dec144
 
0.4%
Histogram of lengths of the category
ValueCountFrequency (%)
may11073
33.6%
jul5753
17.5%
aug4882
14.8%
jun4251
 
12.9%
nov3296
 
10.0%
apr2108
 
6.4%
oct569
 
1.7%
sep453
 
1.4%
mar421
 
1.3%
dec144
 
0.4%

Most occurring characters

ValueCountFrequency (%)
a18484
18.7%
u14886
15.1%
m11494
11.6%
y11073
11.2%
j10004
10.1%
n7547
7.6%
l5753
 
5.8%
g4882
 
4.9%
o3865
 
3.9%
v3296
 
3.3%
Other values (7)7566
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter98850
100.0%

Most frequent character per category

ValueCountFrequency (%)
a18484
18.7%
u14886
15.1%
m11494
11.6%
y11073
11.2%
j10004
10.1%
n7547
7.6%
l5753
 
5.8%
g4882
 
4.9%
o3865
 
3.9%
v3296
 
3.3%
Other values (7)7566
7.7%

Most occurring scripts

ValueCountFrequency (%)
Latin98850
100.0%

Most frequent character per script

ValueCountFrequency (%)
a18484
18.7%
u14886
15.1%
m11494
11.6%
y11073
11.2%
j10004
10.1%
n7547
7.6%
l5753
 
5.8%
g4882
 
4.9%
o3865
 
3.9%
v3296
 
3.3%
Other values (7)7566
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII98850
100.0%

Most frequent character per block

ValueCountFrequency (%)
a18484
18.7%
u14886
15.1%
m11494
11.6%
y11073
11.2%
j10004
10.1%
n7547
7.6%
l5753
 
5.8%
g4882
 
4.9%
o3865
 
3.9%
v3296
 
3.3%
Other values (7)7566
7.7%

day_of_week
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.9 MiB
thu
6937 
mon
6802 
tue
6484 
wed
6468 
fri
6259 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters98850
Distinct characters12
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmon
2nd rowthu
3rd rowfri
4th rowfri
5th rowfri
ValueCountFrequency (%)
thu6937
21.1%
mon6802
20.6%
tue6484
19.7%
wed6468
19.6%
fri6259
19.0%
Histogram of lengths of the category
ValueCountFrequency (%)
thu6937
21.1%
mon6802
20.6%
tue6484
19.7%
wed6468
19.6%
fri6259
19.0%

Most occurring characters

ValueCountFrequency (%)
t13421
13.6%
u13421
13.6%
e12952
13.1%
h6937
7.0%
m6802
6.9%
o6802
6.9%
n6802
6.9%
w6468
6.5%
d6468
6.5%
f6259
6.3%
Other values (2)12518
12.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter98850
100.0%

Most frequent character per category

ValueCountFrequency (%)
t13421
13.6%
u13421
13.6%
e12952
13.1%
h6937
7.0%
m6802
6.9%
o6802
6.9%
n6802
6.9%
w6468
6.5%
d6468
6.5%
f6259
6.3%
Other values (2)12518
12.7%

Most occurring scripts

ValueCountFrequency (%)
Latin98850
100.0%

Most frequent character per script

ValueCountFrequency (%)
t13421
13.6%
u13421
13.6%
e12952
13.1%
h6937
7.0%
m6802
6.9%
o6802
6.9%
n6802
6.9%
w6468
6.5%
d6468
6.5%
f6259
6.3%
Other values (2)12518
12.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII98850
100.0%

Most frequent character per block

ValueCountFrequency (%)
t13421
13.6%
u13421
13.6%
e12952
13.1%
h6937
7.0%
m6802
6.9%
o6802
6.9%
n6802
6.9%
w6468
6.5%
d6468
6.5%
f6259
6.3%
Other values (2)12518
12.7%

duration
Real number (ℝ≥0)

Distinct1463
Distinct (%)4.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean257.3352049
Minimum0
Maximum4918
Zeros4
Zeros (%)< 0.1%
Memory size257.5 KiB

Quantile statistics

Minimum0
5-th percentile36
Q1102
median179
Q3318
95-th percentile746
Maximum4918
Range4918
Interquartile range (IQR)216

Descriptive statistics

Standard deviation257.3316998
Coefficient of variation (CV)0.9999863793
Kurtosis20.16816732
Mean257.3352049
Median Absolute Deviation (MAD)93
Skewness3.24507812
Sum8479195
Variance66219.60371
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
136144
 
0.4%
85139
 
0.4%
92138
 
0.4%
72137
 
0.4%
124134
 
0.4%
135132
 
0.4%
126132
 
0.4%
97132
 
0.4%
104130
 
0.4%
82129
 
0.4%
Other values (1453)31603
95.9%
ValueCountFrequency (%)
04
 
< 0.1%
13
 
< 0.1%
21
 
< 0.1%
33
 
< 0.1%
410
 
< 0.1%
524
 
0.1%
631
0.1%
738
0.1%
854
0.2%
967
0.2%
ValueCountFrequency (%)
49181
< 0.1%
37851
< 0.1%
36431
< 0.1%
35091
< 0.1%
34221
< 0.1%
33661
< 0.1%
33221
< 0.1%
32841
< 0.1%
32531
< 0.1%
31831
< 0.1%

campaign
Real number (ℝ≥0)

Distinct39
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.561729894
Minimum1
Maximum56
Zeros0
Zeros (%)0.0%
Memory size257.5 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median2
Q33
95-th percentile7
Maximum56
Range55
Interquartile range (IQR)2

Descriptive statistics

Standard deviation2.763646396
Coefficient of variation (CV)1.078820371
Kurtosis37.7007375
Mean2.561729894
Median Absolute Deviation (MAD)1
Skewness4.791548913
Sum84409
Variance7.6377414
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
ValueCountFrequency (%)
114148
42.9%
28447
25.6%
34277
 
13.0%
42097
 
6.4%
51295
 
3.9%
6776
 
2.4%
7505
 
1.5%
8320
 
1.0%
9217
 
0.7%
10178
 
0.5%
Other values (29)690
 
2.1%
ValueCountFrequency (%)
114148
42.9%
28447
25.6%
34277
 
13.0%
42097
 
6.4%
51295
 
3.9%
6776
 
2.4%
7505
 
1.5%
8320
 
1.0%
9217
 
0.7%
10178
 
0.5%
ValueCountFrequency (%)
561
 
< 0.1%
432
 
< 0.1%
422
 
< 0.1%
402
 
< 0.1%
355
< 0.1%
343
< 0.1%
333
< 0.1%
322
 
< 0.1%
316
< 0.1%
303
< 0.1%

pdays
Real number (ℝ≥0)

Distinct26
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean962.17478
Minimum0
Maximum999
Zeros12
Zeros (%)< 0.1%
Memory size257.5 KiB

Quantile statistics

Minimum0
5-th percentile999
Q1999
median999
Q3999
95-th percentile999
Maximum999
Range999
Interquartile range (IQR)0

Descriptive statistics

Standard deviation187.6467855
Coefficient of variation (CV)0.1950235959
Kurtosis22.00769747
Mean962.17478
Median Absolute Deviation (MAD)0
Skewness-4.899583916
Sum31703659
Variance35211.31611
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Value               Count   Frequency (%)
999                 31728   96.3%
3                   350     1.1%
6                   326     1.0%
4                   93      0.3%
2                   55      0.2%
9                   54      0.2%
12                  48      0.1%
7                   47      0.1%
5                   39      0.1%
10                  39      0.1%
Other values (16)   171     0.5%

Value               Count   Frequency (%)
0                   12      < 0.1%
1                   22      0.1%
2                   55      0.2%
3                   350     1.1%
4                   93      0.3%
5                   39      0.1%
6                   326     1.0%
7                   47      0.1%
8                   13      < 0.1%
9                   54      0.2%

Value               Count   Frequency (%)
999                 31728   96.3%
27                  1       < 0.1%
26                  1       < 0.1%
22                  2       < 0.1%
21                  2       < 0.1%
20                  1       < 0.1%
19                  3       < 0.1%
18                  7       < 0.1%
17                  7       < 0.1%
16                  8       < 0.1%
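With 96.3% of rows equal to 999 and no other value above 27, 999 looks like a sentinel rather than a real day count. The report itself does not say what it encodes, so treating it as a placeholder is an assumption. Under that assumption, the sketch below shows how strongly the sentinel dominates the reported mean, using only the sum and counts from this section; the variable names are illustrative.

```python
# Figures copied from the pdays statistics above.
n_rows = 32950
total_sum = 31703659
sentinel = 999          # assumed placeholder value, not confirmed by the report
sentinel_count = 31728

# Remove the sentinel's contribution from both the row count and the sum.
real_rows = n_rows - sentinel_count
real_sum = total_sum - sentinel * sentinel_count

mean_with_sentinel = total_sum / n_rows
mean_without_sentinel = real_sum / real_rows

print(real_rows, real_sum)
print(mean_with_sentinel, mean_without_sentinel)
```

If the assumption holds, summary statistics for this variable are more informative when computed on the non-sentinel rows, or when the sentinel is mapped to a missing value before profiling.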

previous
Real number (ℝ≥0)

ZEROS

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.1747799697
Minimum0
Maximum7
Zeros28394
Zeros (%)86.2%
Memory size257.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum7
Range7
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.4965033629
Coefficient of variation (CV)2.840733775
Kurtosis19.94757787
Mean0.1747799697
Median Absolute Deviation (MAD)0
Skewness3.808522604
Sum5759
Variance0.2465155894
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value   Count   Frequency (%)
0       28394   86.2%
1       3703    11.2%
2       603     1.8%
3       175     0.5%
4       56      0.2%
5       14      < 0.1%
6       4       < 0.1%
7       1       < 0.1%

Value   Count   Frequency (%)
0       28394   86.2%
1       3703    11.2%
2       603     1.8%
3       175     0.5%
4       56      0.2%
5       14      < 0.1%
6       4       < 0.1%
7       1       < 0.1%

Value   Count   Frequency (%)
7       1       < 0.1%
6       4       < 0.1%
5       14      < 0.1%
4       56      0.2%
3       175     0.5%
2       603     1.8%
1       3703    11.2%
0       28394   86.2%
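Because previous has only eight distinct values, its frequency table is complete, and the sum, mean, variance and zero share can all be recomputed from it. A minimal sketch, assuming the reported variance is the sample variance (dividing by n - 1, which is the pandas default); the variable names are illustrative.

```python
# Complete frequency table for previous, copied from the report.
freq = {0: 28394, 1: 3703, 2: 603, 3: 175, 4: 56, 5: 14, 6: 4, 7: 1}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in freq.items())
variance = sum_sq_dev / (n - 1)

# Share of rows equal to zero, in percent (the source of the Zeros warning).
zeros_pct = freq[0] / n * 100

print(n, total, mean, variance, zeros_pct)
```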

poutcome
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.1 MiB
nonexistent
28394 
failure
3451 
success
 
1105

Length

Max length11
Median length11
Mean length10.44691958
Min length7

Characters and Unicode

Total characters344226
Distinct characters13
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfailure
2nd rownonexistent
3rd rowfailure
4th rownonexistent
5th rownonexistent
ValueCountFrequency (%)
nonexistent28394
86.2%
failure3451
 
10.5%
success1105
 
3.4%
Histogram of lengths of the category
ValueCountFrequency (%)
nonexistent28394
86.2%
failure3451
 
10.5%
success1105
 
3.4%

Most occurring characters

ValueCountFrequency (%)
n85182
24.7%
e61344
17.8%
t56788
16.5%
i31845
 
9.3%
s31709
 
9.2%
o28394
 
8.2%
x28394
 
8.2%
u4556
 
1.3%
f3451
 
1.0%
a3451
 
1.0%
Other values (3)9112
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter344226
100.0%

Most frequent character per category

ValueCountFrequency (%)
n85182
24.7%
e61344
17.8%
t56788
16.5%
i31845
 
9.3%
s31709
 
9.2%
o28394
 
8.2%
x28394
 
8.2%
u4556
 
1.3%
f3451
 
1.0%
a3451
 
1.0%
Other values (3)9112
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Latin344226
100.0%

Most frequent character per script

ValueCountFrequency (%)
n85182
24.7%
e61344
17.8%
t56788
16.5%
i31845
 
9.3%
s31709
 
9.2%
o28394
 
8.2%
x28394
 
8.2%
u4556
 
1.3%
f3451
 
1.0%
a3451
 
1.0%
Other values (3)9112
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII344226
100.0%

Most frequent character per block

ValueCountFrequency (%)
n85182
24.7%
e61344
17.8%
t56788
16.5%
i31845
 
9.3%
s31709
 
9.2%
o28394
 
8.2%
x28394
 
8.2%
u4556
 
1.3%
f3451
 
1.0%
a3451
 
1.0%
Other values (3)9112
 
2.6%

emp.var.rate
Real number (ℝ)

HIGH CORRELATION

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.0762276176
Minimum-3.4
Maximum1.4
Zeros0
Zeros (%)0.0%
Memory size257.5 KiB

Quantile statistics

Minimum-3.4
5-th percentile-2.9
Q1-1.8
median1.1
Q31.4
95-th percentile1.4
Maximum1.4
Range4.8
Interquartile range (IQR)3.2

Descriptive statistics

Standard deviation1.572241965
Coefficient of variation (CV)20.62562119
Kurtosis-1.073664391
Mean0.0762276176
Median Absolute Deviation (MAD)0.3
Skewness-0.7169400989
Sum2511.7
Variance2.471944796
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%)
1.412927
39.2%
-1.87392
22.4%
1.16210
18.8%
-0.12960
 
9.0%
-2.91348
 
4.1%
-3.4854
 
2.6%
-1.7611
 
1.9%
-1.1504
 
1.5%
-3136
 
0.4%
-0.28
 
< 0.1%
ValueCountFrequency (%)
-3.4854
 
2.6%
-3136
 
0.4%
-2.91348
 
4.1%
-1.87392
22.4%
-1.7611
 
1.9%
-1.1504
 
1.5%
-0.28
 
< 0.1%
-0.12960
 
9.0%
1.16210
18.8%
1.412927
39.2%
ValueCountFrequency (%)
1.412927
39.2%
1.16210
18.8%
-0.12960
 
9.0%
-0.28
 
< 0.1%
-1.1504
 
1.5%
-1.7611
 
1.9%
-1.87392
22.4%
-2.91348
 
4.1%
-3136
 
0.4%
-3.4854
 
2.6%

cons.price.idx
Real number (ℝ≥0)

Distinct26
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean93.57424343
Minimum92.201
Maximum94.767
Zeros0
Zeros (%)0.0%
Memory size257.5 KiB

Quantile statistics

Minimum92.201
5-th percentile92.713
Q193.075
median93.749
Q393.994
95-th percentile94.465
Maximum94.767
Range2.566
Interquartile range (IQR)0.919

Descriptive statistics

Standard deviation0.5786358031
Coefficient of variation (CV)0.006183708057
Kurtosis-0.8359512246
Mean93.57424343
Median Absolute Deviation (MAD)0.38
Skewness-0.2267829565
Sum3083271.321
Variance0.3348193926
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%)
93.9946210
18.8%
93.9185364
16.3%
92.8934687
14.2%
93.4444089
12.4%
94.4653474
10.5%
93.22907
8.8%
93.0751966
 
6.0%
92.201613
 
1.9%
92.963595
 
1.8%
92.431354
 
1.1%
Other values (16)2691
8.2%
ValueCountFrequency (%)
92.201613
 
1.9%
92.379214
 
0.6%
92.431354
 
1.1%
92.469140
 
0.4%
92.649286
 
0.9%
92.713136
 
0.4%
92.7568
 
< 0.1%
92.843212
 
0.6%
92.8934687
14.2%
92.963595
 
1.8%
ValueCountFrequency (%)
94.767103
 
0.3%
94.601162
 
0.5%
94.4653474
10.5%
94.215249
 
0.8%
94.199239
 
0.7%
94.055182
 
0.6%
94.027180
 
0.5%
93.9946210
18.8%
93.9185364
16.3%
93.876176
 
0.5%

cons.conf.idx
Real number (ℝ)

Distinct26
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-40.51867982
Minimum-50.8
Maximum-26.9
Zeros0
Zeros (%)0.0%
Memory size257.5 KiB

Quantile statistics

Minimum-50.8
5-th percentile-47.1
Q1-42.7
median-41.8
Q3-36.4
95-th percentile-33.6
Maximum-26.9
Range23.9
Interquartile range (IQR)6.3

Descriptive statistics

Standard deviation4.623004314
Coefficient of variation (CV)-0.1140956303
Kurtosis-0.3569941733
Mean-40.51867982
Median Absolute Deviation (MAD)4.4
Skewness0.310353195
Sum-1335090.5
Variance21.37216888
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%)
-36.46210
18.8%
-42.75364
16.3%
-46.24687
14.2%
-36.14089
12.4%
-41.83474
10.5%
-422907
8.8%
-47.11966
 
6.0%
-31.4613
 
1.9%
-40.8595
 
1.8%
-26.9354
 
1.1%
Other values (16)2691
8.2%
ValueCountFrequency (%)
-50.8103
 
0.3%
-50212
 
0.6%
-49.5162
 
0.5%
-47.11966
 
6.0%
-46.24687
14.2%
-45.98
 
< 0.1%
-42.75364
16.3%
-422907
8.8%
-41.83474
10.5%
-40.8595
 
1.8%
ValueCountFrequency (%)
-26.9354
 
1.1%
-29.8214
 
0.6%
-30.1286
 
0.9%
-31.4613
 
1.9%
-33136
 
0.4%
-33.6140
 
0.4%
-34.6142
 
0.4%
-34.8209
 
0.6%
-36.14089
12.4%
-36.46210
18.8%

euribor3m
Real number (ℝ≥0)

HIGH CORRELATION

Distinct314
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.615653596
Minimum0.634
Maximum5.045
Zeros0
Zeros (%)0.0%
Memory size257.5 KiB

Quantile statistics

Minimum: 0.634
5-th percentile: 0.797
Q1: 1.344
Median: 4.857
Q3: 4.961
95-th percentile: 4.966
Maximum: 5.045
Range: 4.411
Interquartile range (IQR): 3.617

Descriptive statistics

Standard deviation: 1.73574798
Coefficient of variation (CV): 0.4800647889
Kurtosis: -1.416898308
Mean: 3.615653596
Median Absolute Deviation (MAD): 0.108
Skewness: -0.7021616926
Sum: 119135.786
Variance: 3.012821051
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
4.857               2271    6.9%
4.962               2061    6.3%
4.963               1968    6.0%
4.961               1512    4.6%
4.856               983     3.0%
4.964               932     2.8%
1.405               924     2.8%
4.864               849     2.6%
4.965               845     2.6%
4.96                811     2.5%
Other values (304)  19794   60.1%
Value               Count   Frequency (%)
0.634               6       < 0.1%
0.635               32      0.1%
0.636               10      < 0.1%
0.637               6       < 0.1%
0.638               6       < 0.1%
0.639               16      < 0.1%
0.64                9       < 0.1%
0.642               32      0.1%
0.643               20      0.1%
0.644               29      0.1%
Value               Count   Frequency (%)
5.045               7       < 0.1%
5                   7       < 0.1%
4.97                142     0.4%
4.968               788     2.4%
4.967               518     1.6%
4.966               489     1.5%
4.965               845     2.6%
4.964               932     2.8%
4.963               1968    6.0%
4.962               2061    6.3%

nr.employed
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5166.859608
Minimum: 4963.6
Maximum: 5228.1
Zeros: 0
Zeros (%): 0.0%
Memory size: 257.5 KiB

Quantile statistics

Minimum: 4963.6
5-th percentile: 5017.5
Q1: 5099.1
Median: 5191
Q3: 5228.1
95-th percentile: 5228.1
Maximum: 5228.1
Range: 264.5
Interquartile range (IQR): 129

Descriptive statistics

Standard deviation: 72.20844837
Coefficient of variation (CV): 0.01397530683
Kurtosis: -0.01823278267
Mean: 5166.859608
Median Absolute Deviation (MAD): 37.1
Skewness: -1.03710511
Sum: 170248024.1
Variance: 5214.060016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value               Count   Frequency (%)
5228.1              12927   39.2%
5099.1              6865    20.8%
5191                6210    18.8%
5195.8              2960    9.0%
5076.2              1348    4.1%
5017.5              854     2.6%
4991.6              611     1.9%
5008.7              527     1.6%
4963.6              504     1.5%
5023.5              136     0.4%
Value               Count   Frequency (%)
4963.6              504     1.5%
4991.6              611     1.9%
5008.7              527     1.6%
5017.5              854     2.6%
5023.5              136     0.4%
5076.2              1348    4.1%
5099.1              6865    20.8%
5176.3              8       < 0.1%
5191                6210    18.8%
5195.8              2960    9.0%
Value               Count   Frequency (%)
5228.1              12927   39.2%
5195.8              2960    9.0%
5191                6210    18.8%
5176.3              8       < 0.1%
5099.1              6865    20.8%
5076.2              1348    4.1%
5023.5              136     0.4%
5017.5              854     2.6%
5008.7              527     1.6%
4991.6              611     1.9%

y
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 32.3 KiB

Value               Count   Frequency (%)
False               29258   88.8%
True                3692    11.2%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
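
As a minimal sketch of this calculation (plain Python, not the implementation used to produce this report; the function name `pearson_r` is illustrative), the example below uses the euribor3m and nr.employed values from the first five sample rows shown later in this report. Five rows are far too few to reproduce the dataset-level coefficient; they only illustrate the mechanics.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of (cross-)deviations are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

# euribor3m and nr.employed from sample rows 0 to 4 of this report.
euribor3m = [1.299, 4.860, 1.313, 4.967, 4.963]
nr_employed = [5099.1, 5191.0, 5099.1, 5228.1, 5228.1]
r = pearson_r(euribor3m, nr_employed)

# Shifting and rescaling one variable (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in euribor3m], nr_employed)
print(round(r, 3), math.isclose(r, r_rescaled))
```

The second call illustrates the invariance under changes of location and scale mentioned above.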

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
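
The following is a hedged sketch of that recipe in plain Python: values are replaced by their ranks (ties receive the average rank, which is an assumption of this sketch), and the Pearson formula is then applied to the ranks. The function names are illustrative and the data are made up to show a nonlinear but monotonic relationship.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

# Made-up data: y grows as the cube of x, so the relation is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
print(rho)
```

Because the ranks of `xs` and `ys` coincide, ρ reaches its maximum here even though the relationship is not a straight line.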

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
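
A direct, quadratic-time sketch of this definition is shown below. It implements the formula exactly as stated here (often called tau-a); library implementations commonly apply a tie correction, so their results may differ when the data contain ties. The function name and the data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    points = list(zip(xs, ys))
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y; the pair counts for neither side
    total_pairs = len(points) * (len(points) - 1) // 2
    return (concordant - discordant) / total_pairs

# Made-up data: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```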

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
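
For illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table, using the correction as it is commonly stated: the chi-squared based φ² is reduced by (k-1)(r-1)/(n-1) and the row and column counts are shrunk accordingly. This is an assumption about the form of the correction, not the code used for this report; the function name and the tables are hypothetical.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a rectangular contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("need at least 2 observations and no empty row or column")

    # Pearson chi-squared statistic from observed and expected counts.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)

# Hypothetical 2x2 tables: one with independent variables, one perfectly associated.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```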

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

agejobmaritaleducationdefaulthousingloancontactmonthday_of_weekdurationcampaignpdayspreviouspoutcomeemp.var.ratecons.price.idxcons.conf.idxeuribor3mnr.employedy
057technicianmarriedhigh.schoolnonoyescellularmaymon37119991failure-1.892.893-46.21.2995099.1no
155unknownmarriedunknownunknownyesnotelephonemaythu28529990nonexistent1.193.994-36.44.8605191.0no
233blue-collarmarriedbasic.9ynononocellularmayfri5219991failure-1.892.893-46.21.3135099.1no
336admin.marriedhigh.schoolnononotelephonejunfri35549990nonexistent1.494.465-41.84.9675228.1no
427housemaidmarriedhigh.schoolnoyesnocellularjulfri18929990nonexistent1.493.918-42.74.9635228.1no
558retiredmarriedprofessional.coursenoyesyescellularjulfri60519990nonexistent1.493.918-42.74.9625228.1no
648servicesmarriedhigh.schoolunknownyesnotelephonemaywed24319990nonexistent1.193.994-36.44.8565191.0no
751admin.divorceduniversity.degreeunknownyesnocellularaugthu2479990nonexistent1.493.444-36.14.9625228.1no
824entrepreneurmarrieduniversity.degreenoyesyestelephonejunwed12649990nonexistent1.494.465-41.84.9625228.1no
936techniciandivorcedprofessional.coursenoyesyescellularjulmon4349990nonexistent1.493.918-42.74.9625228.1no

Last rows

agejobmaritaleducationdefaulthousingloancontactmonthday_of_weekdurationcampaignpdayspreviouspoutcomeemp.var.ratecons.price.idxcons.conf.idxeuribor3mnr.employedy
3294030admin.singleprofessional.coursenononocellularaugtue9779990nonexistent1.493.444-36.14.9635228.1no
3294154admin.marrieduniversity.degreenoyesnocellularaugfri91231success-2.992.201-31.40.8495076.2yes
3294247entrepreneurmarrieduniversity.degreenoyesnotelephonejunmon8229990nonexistent1.494.465-41.84.9615228.1no
3294325blue-collarmarriedbasic.9yunknownyesnotelephonemaywed48819990nonexistent1.193.994-36.44.8585191.0no
3294449blue-collarmarriedhigh.schoolunknownyesnocellularaugthu19919990nonexistent1.493.444-36.14.9635228.1no
3294556housemaidmarriedbasic.4ynonoyescellularjulmon11619990nonexistent1.493.918-42.74.9605228.1no
3294637managementmarrieduniversity.degreenonoyescellularjulfri6979990nonexistent1.493.918-42.74.9575228.1no
3294726admin.singleuniversity.degreenononocellularmaytue13549991failure-1.892.893-46.21.2665099.1no
3294831blue-collarsinglebasic.9ynononocellularaprmon38619990nonexistent-1.893.075-47.11.4055099.1no
3294939housemaidmarriedbasic.4ynononocellularaugthu17919990nonexistent1.493.444-36.14.9635228.1no

Duplicate rows

Most frequent

agejobmaritaleducationdefaulthousingloancontactmonthday_of_weekdurationcampaignpdayspreviouspoutcomeemp.var.ratecons.price.idxcons.conf.idxeuribor3mnr.employedycount
027techniciansingleprofessional.coursenononocellularjulmon33129990nonexistent1.493.918-42.74.9625228.1no2
132techniciansingleprofessional.coursenoyesnocellularjulthu12819990nonexistent1.493.918-42.74.9685228.1no2
235admin.marrieduniversity.degreenoyesnocellularmayfri34849990nonexistent-1.892.893-46.21.3135099.1no2
336retiredmarriedunknownnononotelephonejulthu8819990nonexistent1.493.918-42.74.9665228.1no2
439blue-collarmarriedbasic.6ynononotelephonemaythu12419990nonexistent1.193.994-36.44.8555191.0no2
541technicianmarriedprofessional.coursenoyesnocellularaugtue12719990nonexistent1.493.444-36.14.9665228.1no2
645admin.marrieduniversity.degreenononocellularjulthu25219990nonexistent-2.992.469-33.61.0725076.2yes2
755servicesmarriedhigh.schoolunknownnonocellularaugmon3319990nonexistent1.493.444-36.14.9655228.1no2
871retiredsingleuniversity.degreenononotelephoneocttue12019990nonexistent-3.492.431-26.90.7425017.5no2
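
A table like the one above can be derived by counting identical rows. The sketch below shows one way to do this with the Python standard library; it is not the report generator's implementation, and the function name and the three-column toy rows are purely illustrative.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return (row, count) pairs for rows occurring more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    return [(row, count) for row, count in counts.most_common() if count > 1]

# Toy rows with three illustrative columns (age, job, marital).
rows = [
    (27, "technician", "single"),
    (32, "technician", "single"),
    (27, "technician", "single"),
    (45, "admin.", "married"),
]
dups = duplicate_rows(rows)
print(dups)
```

Rows are compared on all of their columns, so two records count as duplicates only if every field matches.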